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Abstract 

This paper considers the problem of designing real-time inputs into controllable nonlinear 
diffusion processes so as to obtain maximally-informative observations about a parameters of 
interest. To do this, we maximize the Fisher information about the parameters of interest. We 
demonstrate that this problem can be expressed as a problem in optimal control and apply numer- 
ical techniques in stochastic control to solve it. These methods can be applied when the diffusion 
process is observed continuously and without noise. When observations are incomplete, infre- 
quent or noisy we propose applying a particle filter to first estimate the state of the system and 
then apply the estimated control policy using the posterior expectation of the state as an input. 
We demonstrate the effectiveness of these methods in models for simple ecological systems and 
in models of single-neuron experiments. 

1 Introduction 

This paper considers a novel problem in experimental design, specifically the design of inputs to 
nonlinear stochastic systems to improve the estimation of parameters from the observation of such 
systems. There has been considerable interest in recent statistical literature in providing diagnostic 
methods for model improvement (Hooker, 2009; Miiller and Yang, 2010) and in providing meth- 
ods for performing inference in mechanistic, non-linear dynamic systems models, both those de- 
scribed by ordinary differential equations (Brunei, 2008; Ramsay, Hooker, Campbell, and Cao, 2007; 
Girolami, Calderhead, and Chin, 2010; Huang and Wu, 2006) and explicitly stochastic models (Ionides, Breto, and King, 
2006; Ait-Shahalia, 2008; Wilkinson, 2006). However, little attention has been given to the problem 
of designing experiments on dynamic systems so as to yield maximal information about the param- 
eters of interest. In this paper, we use stochastic optimal control techniques to construct dynamic, 
data-driven experimental protocols, which can improve the accuracy of parameter estimates based on 
dynamic experiments. 

To illustrate the application of experimental design we focus on two applied examples. Chemostat 
experiments - described in detail in Sect. 4 - are a laboratory based framework in which to conduct 
controlled experiments in ecology. The chemostat apparatus is a small tank in which an ecology of 
micro-organisms can be developed. In our example, we deal with the simplest such model in which 
algae is grown on a nitrogen-rich substrate. A description of the system dynamics can be given in 
terms of an algal growth rate proportional to both the algal population and the nitrogen available, 
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modulated by a saturation effect (see (6)). Parameters that control the rate of saturation can be partic- 
ularly poorly determined from data; this paper focuses on finding experiments that will minimize the 
variance of parameter estimates. In these experiments, nitrogen-rich substrate is continuously added 
to the chemostat while the chemostat contents are also removed at the same rate. This dilution rate 
represents a real-time input which we will seek to choose adaptively so as to optimize information 
about the half-saturation constant for the system. 

Our second example focuses on single neuron experiments. In such experiments, an electrode is 
attached to a neuron through which the neuron is stimulated by an electrical current. With sufficient 
stimulation the neuron generates an action potential, or "spike." This electrical activity can be mea- 
sured through the same electrode; model parameters can then be inferred from these measurements. 
In this context, we seek to adaptively choose the electrical stimulus so as to provide maximal informa- 
tion about parameters in the Morris-Lecar model (Morris and Lecar, 1981), which is frequently used 
in neuroscience because it is relative simple yet still captures a wide range of dynamical behavior. 

These two systems represent a useful contrast in the quality of data, system behavior and level of 
stochastic variation in system dynamics. Data from chemostat experiments can be gathered at only 
infrequent intervals and are subject to variation due to binomial sampling of the tank and counting 
error. Their dynamics exhibit relatively high stochasticity but tend to an over-all fixed level. By 
contrast, in neuronal recordings, membrane voltage can be measured nearly continuously and with 
very high precision, and the signals exhibit small, but important, stochastic fluctuations on top of 
regular (e.g., oscillatory) behavior. The effect and utility of adaptive control is therefore substantially 
different between the two models. 

Both systems above can be described as multivariate diffusion processes of the form 

dx t = f(x t , 6, u t )dt + ^{xtf^dWt (1) 

where is a parameter to be estimated and u t a time-varying control that can be chosen, x t is the 
vector of state variables and f is a vector valued function. We demonstrate that as the experiment 
progresses, ut can be chosen adaptively to maximize Fisher information about 6. When direct ob- 
servations of the full state vector are available, the "control policy" (i.e., choice of u f ) can be done 
using the tools of stochastic optimal control. When direct observations are not available, we propose 
using a particle filter (Doucet, De Freitas, and Gordon, 2001) to estimate the state variables and ap- 
ply the control that would be chosen if the estimated states were the quantities directly observed. A 
simulation study demonstrates that this approximation still provides improved experimental perfor- 
mance. Alternative models, based on partially observed Markov decision processes are considered 
in Thorbergsson and Hooker (2012); however these are more computationally expensive and further 
limit the systems to which they can be applied. 

To our knowledge, this is the first time that experimental design has been employed for explicitly 
stochastic, fully nonlinear systems. It is also the first time that adapting to observations over time 
has been considered as opposed to establishing pre-set values of the the inputs. Within the context of 
stochastic models, adaptation can be expected to be particularly important; information about param- 
eters is typically maximized at particular parts of state space and the choice of u that will maintain 
a system close to high information regions will depend on the current state of the system and cannot 
be determined prior to the experiment. In related work, experimental design has been considered for 
ordinary differential equation models in Bauer, Bock, Korkel, and Schloder (2000) and the choice of 
subject and sampling times has been investigated in Wu and Ding (2002) for mixed-effects models 
with ordinary differential equations. Huggins and Paninski (2011) consider the optimal placement of 
probes on dendritic trees following linear dynamics. 
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For numerical reasons, we have chosen both systems studied here to represent the simplest pos- 
sible models and experiments in their scientific domain. In particular, they are both planar models, 
i.e., involve only two state variables. The strategy proposed in this paper can potentially be applied 
to systems with more than 2 variables, but systems in more than a few dimensions may become 
computationally prohibitive. Higher-dimensional systems thus require substantially new numerical 
techniques and ideas. This is the subject of on-going work. 

The paper is structured as follows: in Sect. 2, we provide a precise formulation of the dynamic 
experimental design problem and outline our solution strategy for "full observation" problems where 
we can observe the state vector continuously. Sect. 3 discusses our approximate strategy for solving 
partial observation problems. A detailed study of the chemostat model is presented in Sect. 4, and of 
the Morris-Lecar neuron model in Sect. 5. Some final remarks and directions for further research are 
detailed in Sect. 6. 

Sample implementation. We have implemented all the algorithms described in this paper, and used 
the implementation to compute the examples. The source code, all written in R, can be obtained from 

http : / / math .arizona.edu/^klin/rstoc 

2 Maximizing Fisher Information in Diffusion Processes 

In this section we demonstrate that experimental design for inputs into diffusion processes can be cast 
as a problem in control theory and present numerical methods for constructing this design. Sect. 2. 1 
presents the problem and provides an overview of the dynamic programming methods to solve it. In 
Sect. 2.2 we present details of the numerical schemes used while we consider the dependence of our 
design on the parameters of interest in Sect. 2.3. Once the problem is cast in terms of control theory, 
we are able to follow well-developed methods; these are described here but the reader is referred to 
Kushner (2000) for details. 

2.1 Problem formulation 

We begin by assuming the entire state vector is continuously observable in time. We refer to this as 
the "full observation" problem, the solution of which will form the basis for the more realistic but 
challenging "partial observation" case (Sect. 3). 

Consider the multivariate diffusion process (1). Our goal is to estimate 6 using observations of 
x t for t € [0, T], up to some final time T. At each moment in time, we assume the experimenter can 
adjust the control ut using all available information from the past, i.e., x s for s < t . In conceptual 
terms, our problem is to find a control policy: a procedure for choosing a control value u t based 
on past observations such that the resulting estimator of 6 is optimal, e.g., has minimum asymptotic 
variance. We propose below a general strategy for doing this, i.e., given a diffusion model of the 
form (1), our algorithm computes a variance-minimizing control policy. Note that our approach is 
fairly general and can accommodate other notions of optimality, but for concreteness we focus on 
asymptotic variance here. 
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First, recall that when x t is continuously observed, the log likelihood of 9 is given by 



|x) = i I f(x t , 9, u t ) T • 5T 1 (x t ) • f(x t , 9, u t ) dt 
1 Jo 

Jo 



(see, e.g., Rao, 1999). Since we are interested in choosing u t to minimize the asymptotic variance of 
the parameter estimate, we should maximize the associated Fisher information 



1(9, u) = E 



T 



d 

dB 



i(x t ,9,u t ) 



dt 



(2) 



s(x t ) 



with ||z||x; = z T S _1 z. A priori, u t can be any function of past history (x s : s < t) and t, i.e., it is a 
process adapted to x t . 

The variational problem (2), in which we want to find the x t -adapted process ut that maximizes 
1(9, u), is a standard problem in stochastic optimal control. We review the main relevant ideas be- 
low, and refer interested readers to Kushner (1971, 2000); Bensoussan (1992), and references therein 
for details. The dynamic programming methods employed to solve the control problem resemble 
techniques used for sequential designs in clinical trials - see Bartroff and Lai (2010) for recent devel- 
opments - but are employed in a substantially different context. 

An important practical issue is that in applying this strategy, the value of the unknown parameter 
9 is needed for evaluating the Fisher information (2). This and surrounding issues are discussed in 
Sect. 2.3 by incorporating a prior on 9. For the moment, we simply assume that a reasonably good 
initial guess of 9 is available, and is used for the pre-computation of a control policy. Needless to 
say, the computed policy will not be optimal, but provided the initial guess is sufficiently good, the 
variance of the estimator subject to this control policy should still be lower than the uncontrolled 
system. 

Brief review of dynamic programming. A basic fact of optimal control theory is that because the 
process x t is Markovian and the targeted information can be expressed as an integral, the optimal 
control u t depends only on the current state x t rather than entire past history of x. That is, there 
is a function F : M. d x [0, Tfi na i] — > U, where M. d is the state space, T^ na i is the duration of the 
experiment, and U is the set of permissible control values, such that ut = F(x t ,t) gives the optimal 
control policy. In other words, the controlled process is also Markovian. The main computational 
problem is to find F given an equation of the form (1). This can be done by combining two ideas: 
Markov chain approximations, which we discuss in Sect. 2.2, and Bellman's dynamic programming 
algorithm, which we describe below. 

First, for simplicity, suppose the diffusion process (1) has been approximated at discrete times 
ti = iAt, i = 1, . . . , T, yielding 

x i+1 =x i + i(x i ,9,u i ) At + VAt^ 1/2 (xi)ei 

with the €i independent vectors of independent normal random variables, and approximate the Fisher 
Information by 

d 2 
-f(Xi,9,Ui) At. (3) 



FI(8)=Y J E 



i=l 



d9 



S(Xi) 
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Our goal is now to choose Ui at each step to maximize FI{9) over the whole process. 

To achieve this, we use the dynamic programming algorithm, which solves the optimal con- 
trol problem on progressively longer timescales by starting at the final timestep and iteratively step- 
ping backwards in time, calculating the optimal control policy at each time while accounting for 
the (already-calculated) policy that will be employed in the future. Specifically, define the Fisher 
Information To Go (FITG) at each timestep i by (5) 



FIi(0,Xi) = max £x. +1 | Xi , u . 



FL 



S(Xi) 



(4) 



where the conditional expectation Ex i+1 \x ilUi (-) averages over all possible values of the new step 
Xj+i, given the current state Xj and control Uj. (Note that choice of Uj affects not only the second term 
inside the expectation, but also the first term since choice of V4 affects Xj + i.) Observe that if V4 were 
chosen to maximize FIi(6, Xj) at each step, i.e., 



Ui = F(xi,iAt) = argmax £ Xi+1 | Xi;U 



FI 



+ 



7w 



f(xi,e,u) 



S(Xi) 



(5) 



then the Fisher information FI(6) in Eq. (3) will be equal to E\ ( FIq(6, xq 



Ex [•} averages over the initial conditions. That is, at each step i, choose u 



, where the expectation 
= F(xi,iAt) according 

to Eq. (5); since we can assume that FIj(9,x) has already been computed for all j > i, we need 
only optimize the Fisher information with respect to Ui , which is typically straightforward to do (see 
below). 

The procedure iteratively constructs a control policy Fi for each time step i; one can also modify 
the argument to show that the optimal control depends only on the present state and not on the entire 
past history. 

Optimization over U. The optimization over it, required at each step may be done in a number of 
ways, depending on the form of the allowed set of control values U: if U is, say, an interval or an 
open region in W 1 , derivative-based optimization techniques like Newton-Raphson may be useful. 
In the present paper, we assume the control values are drawn from a finite set U (this is a version 
of so-called "bang-bang" control); the optimization in Eq. (5) is thus done by picking the maximum 
over this (small) finite set. 

In general, special care must be taken when the control value set U is unbounded or very large, 
as (5) may be maximized at unrealistic values of m in such situations. As an example, consider the 
univariate Ornstein-Uhlenbeck process with additive control 



dx = {-fix + u)dt + adW t 



where at step T — 2, 



u T _ 2 = argmax E Xti | Xt _ 2)Ut _ 2 x t _ 1 



((1 - At[3) x T _ 2 + Atu T - 2 ) + Ata 2 



which is maximized for u = ±00. A common way to ensure that solution to the control problem 
remain feasible is to add a penalty to the Fisher Information, so that (5) has a finite, well-defined, 
solution. In this case, allowing At — > results in ut being given in terms of the solution of the 
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Hamilton- Jacobi-Bellman partial differential equations (see Kushner, 1971). For the bang-bang con- 
trols considered in this paper, where the range of possible values u is restricted, such a penalty is not 
needed. But the evaluation of the integral in (4) then becomes analytically intractable, necessitating 
the numerical approximations we used. 

2.2 Markov chain approximation and implementation details 

Following Kushner (1971), we construct a finite-state Markov chain to approximate our dynamical 
system. The basic idea is to cover the domain with a regular grid whose vertices are the states of 
the approximating Markov chain, and to choose transition probabilities via a finite difference dis- 
cretization of the generator of the process. In our implementation, we employ a "split operator" finite 
difference discretization. Roughly speaking, this amounts of discretizing the drift and diffusion terms 
in the SDE (1) separately, then composing the resulting difference schemes. This is a standard tech- 
nique often used for the numerical solution of partial differential equations (see, e.g., Strikwerda, 
1989). In our test examples, the use of such a split operator scheme permits us to use larger grid 
spacings and timesteps while maintaining numerical stability, and can significantly speed up overall 
running speeds. 

First, suppose for simplicity that the state space in Eq. (1) is R. For each h > 0, let Gh denote the 
regular grid KL with spacing h. For the moment, assume Gh extends to all of R; boundary conditions 
are discussed below. The grid Gh will be the state space of our discrete-time approximating Markov 
chain Xj. For reasons that will become clear later, the timestep cannot be arbitrary. Instead, we fix a 
constant \i > and an integer r > 0, and set the timestep to be At h = fih 2 . Then our approximation 
can be described as follows: 

Step 1. Suppose at time t = iAt h , the chain has state Xj = sh. Then 

- with probability \ f(sh)\ ■ At h /h, jump to (s + l)h if f(sh) > and to (s - if 
f(sh) < 0; and 

- stay at sh with probability 1 - \f(sh)\ ■ At h /h . 

Step 2. Let s'h denote the state of the chain after the previous step. Then jump to (s'±r)h 
with probability ^H(s' h) p, / r 2 , and stay at s'h with probability 1 — H(s'h)fi/r 2 . 

This discretization scheme treats the drift and diffusion terms in Eq. ( 1 ) separately, whence the 
name "split operator." Both conceptually and in terms of programming, split operator schemes are 
often easier to work with. They typically impart an additional variance on the motion of the approx- 
imating Markov chain, on the order of 0(hAt h ) = 0(h 3 ) per step; the relative effect of this extra 
variance (so-called numerical diffusion) vanishes as h — > 0. 

Clearly, in order for the various expressions above to be valid transition probabilities, we must 
have p/r 2 < inf^gig S(x)" 1 . We also need sup,,, \f(x)\ At h /h < 1. Since At h /h = \ih, the second 
condition can always be achieved by taking h sufficiently small. The first condition, however, places 
a rather stringent constraint on the timestep At h . The purpose of introducing the "skip factor" r is 
precisely to partially alleviate this constraint, at the cost of losing some accuracy. In practice, going 
from r = 1 to r = 2 can have a large impact on the overall running time. If r = 1, our scheme closely 
resembles the "up-wind" scheme of Kushner (Kushner, 1971); a scheme similar to our r = 2 case is 
also described there. 
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In practice, rather than choosing h and fi, we usually first choose h and At, and take r oc T, max ^JJL 
with a constant of proportionality between 1 and 2, where T, max is the maximum of over the 
domain of interest. This guarantees that the scheme converges (see below). We usually use relatively 
coarse grids, as (i) the structure of the optimal control policy is not too fine, and (ii) we usually operate 
in the presence of observation and dynamical noise, so there is not much point trying to pin down fine 
structures in the control policy. 

Convergence. In general, a sufficient condition for the Markov chain approximation method to be 
valid is that the approximating chain Xj satisfy (Kushner, 1971) 

Ei(x i+1 - in) = f(xi, Ui )M + 0(h a At) 
Var;(x i+1 - Xi) = S(xi)At + 0(h a At) 
\x i+ i - Xi\ = 0(h) 

In the above, Ei(-) denotes conditional expectation given all information up to step n, Varj(-) denotes 
the corresponding autocovariance matrix, and a > is a constant. Note the last line should be 
interpreted to hold surely, i.e., it says the discrete Markov chain makes jumps of 0(h) in size. If 
the conditions above are satisfied, then as h — > 0, the optimal control policy for the approximating 
chain will converge to an optimal control policy for the diffusion (1). It is easy to see that our scheme 
satisfies the convergence criteria above with a = 1 . 

Boundary conditions. In practice, Qh must be a finite grid. We assume that it spans a subset of R d 
sufficiently large that on the timescale of interest, trajectories have very low probability of exiting. 
We then impose that when the approximating Markov chain attempts to exit the domain, it is forced 
to stay at its current state. This gives a simple way to obtain a Markov chain on bounded grids. 

Higher dimensions. The generalization of the preceding scheme to higher dimensions is straightfor- 
ward: we simply treat each dimension separately and independently. 



2.3 Parameter Dependence and Priors 

Throughout the discussion above, we have constructed the Fisher Information 1(9, u) and the result- 
ing control policy u(x, t) for a known 9. In general, 9 will not be known - there would otherwise 
be little point in the experiment - and both Fisher Information and the optimal control policy may 
depend strongly on its value. 

We address this issue by constructing a prior ir(9) over plausible values of 9 and maximizing 
EqI(9, u). Note here that careful attention must be payed to the choice of prior here. The dynamics of 
x may change rapidly depending on the value of 9 and only some dynamic regimes will be plausible. 
In these cases, the chosen u may also yield poor results. Numerically, we implement a prior by 
averaging (2) over a grid of 9 and note that all the methods described below can readily accommodate 
this operation. 

We note that while averaging the objective over possible values of the parameter is a natural pro- 
cedure in some situations, e.g., when the optimal control policy is not too sensitive to the parameter 
9, it may not always be the optimal strategy. When the optimal control policy varies significantly 
with 9 and an estimate of parameters can be cheaply obtained and updated as the experiment pro- 
gresses, it may be preferable to use the control policy associated using these on-line estimates, e.g., 
by pre-computing a set of control policies (for a grid of parameter values) and use the current best 
parameter estimate to choose a control policy. Alternatively, as the experiment progresses, Fisher 
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Information could be averaged over a posterior for the parameters. These ideas are explored in 
Thorbergsson and Hooker (2012) in the context of discrete-state Markov models, but will impose 
a substantial computational burden in more complex systems and have not been explored here. 



3 Partial Observations 

The algorithms discussed in Sect. 2 are appropriate for a model in which all the variables in x t are 
observed continuously over time without error. This is rarely the case in practice. For example, in our 
chemostat problem, there is no direct means of assessing the nitrogen content of the the chemostat. 
Algae are measured by placing a slide with a sample taken from the chemostat in a particle counter, at 
most every few hours. These observations are therefore subject to both counting and sampling errors. 
In the Morris-Lecar neural model, only voltage can be measured, although with high accuracy and 
frequency. Thus, in typical applications, not all state variables will be measured, measurements will 
be taken at discrete times and are likely to be corrupted by noise. This results in two problems for 
the strategy outlined in Sect. 2.1: first, since not all state variables are being measured, the dynamic 
program above cannot be directly employed 1 and second, that the Fisher information employed as a 
target of the dynamic program is no longer representative of the variance of parameter estimates. 

To overcome the difficulties mentioned above, we propose to a general approximation strategy 
using particle filters (see, e.g., Liu, 2008). Specifically, we first solve the full observation problem by 
finding the control policy P that minimizes the Full-Observation Fisher Information (FOFI) given by 
(2). During the actual experimental run (in which we only have noisy measurements of some state 
variables), we use a particle filter to estimate the missing state variables, which we then plug into the 
FOFI control policy T . We argue that this approximation will be reasonable when the particle filter 
is accurate in the sense that the conditional variation of the process given observations of it is small. 

Beginning with the approximation to Fisher information, within a partially observed system, the 
complete data log likelihood can be written as 



l(0\x,Y) = J2^gp(YMt i )) + l(0\x) 



i=i 

for an observation vector Y = (Y\, . . . ,Y n ) assuming that 9 does not appear in the observation 
process. From the formulation for the observed Fisher Information 

I(9\Y) = E xlY ^l(6\x, Y) - var X | y Z(#|x, Y) 
the (expected) Fisher Information for the partially observed diffusion process can be written as 



I Y (0,u) =E x 



d 



dO 

E Y var x \Y 



dt 



E(X t ) 



f(x t ,6,u t 

J ^f(x t , 9, n t )S(x t )~ 1 (dx t - f(x t , 9, u t )dt) 



'There is a fundamental mathematical reason for this: projections of Markov processes are typically not Markov, and 
without Markovianity we cannot assume that the optimal control is a function of the present state alone. Indeed, typically 
this is not the case. 
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where we observe that the first term is the objective of our dynamic program. We observe that the 
second term shrinks with var(dx|y) assuming that di/d6 is continuous in x. 

The argument above indicates that, provided the conditional variance of dx given observations 
Y is small, maximizing FOFI will provide improved information about 9. It remains necessary to 
demonstrate that employing the filtered estimate will approximately maximize FOFI. For linear sys- 
tems, dxt = {Ax t + Butjdt + dWt, with a quadratic objective function J xJC{t)x t dt, the separation 
theorem guarantees that this strategy is, in fact, optimal (see Kushner, 1971). For this linear system, 
the Fisher Information about parameters in A is given by f \\{dA/ d6)x t \\ 2 dt for which the proposed 
strategy is optimal. 

This need not be the case for nonlinear systems or non-quadratic functions. Extensions of the 
separation theorem have been developed in Kilicaslan and Banks (2009) based on successive approx- 
imations of (1) in terms of a linear system with time- varying parameters. Particle filter methods can 
also be employed to average the future cost over the current distribution of the estimated state vari- 
ables (Andrieu et al., 2003). Both of these schemes require re-running the dynamic program at each 
time point; this will not always be computationally feasible. 

In contrast, the approach of merely using the filtered estimate of the state allows the map from 
state variables to controls to be pre-computed. This was employed in Botchu and Ungarala (2007), 
for example and we demonstrate here that it is still helpful for estimating parameters. As above, if 
the system is such that a bound can be placed on | |x* — x| | where dx* is the filtered estimate, under 
suitable regularity conditions this procedure will yield approximately optimal results. The details of 
conditions for this to hold will depend on context-specific factors such as whether the control policy 
takes discrete or continuous values and we do not attempt to fill these in here. 

Implementation details 

Particle filters. Since our observations are linear, at each observation step the posterior distribution is 
Gaussian. We can therefore sample exactly, without the need for weights or resampling / bootstrap- 
ping. 

Infrequent observations. In practice, we can use the past information to extrapolate the state trajec- 
tory between observations, and use a time-dependent control. For both numerical and experimental 
implementation, it is far simpler to hold the control value constant between observations. 

Variance-reduced maximum likelihood estimation. We assume flat priors over a given parameter 
interval / , and discretize / into a discrete grid In ■ For each parameter value in In , we run a particle 
filter and compute an estimate of the log-likelihood using the formula 2 



where {xj : i = 1, • • • , N} denotes the ensemble of particles at time t . If the computed log- 
likelihood curve takes on a maximum in the interior of the interval I , we do a standard 3 -point 
quadratic interpolation around the maximum and find the maximizer; this is the MLE. On the rare 
occasions when the log-likelihood curve takes on a maximum outside the interval / , we use the 
corresponding endpoint as the estimate. 

2 See, e.g., the Supplemental Information for Ionides et al. (2006). 
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The particle filters for different parameter values are driven by the same sequence of gaussian 
random numbers. This same-noise coupling or method of same paths reduces the variance of the 
resulting estimates (Asmussen and Glynn, 2008). 



4 Example: Chemostat Growth Models 

In chemostat experiments a glass tank (chemostat) is inoculated with a population of algae. The 
tank is bubbled to ensure that the tank contents are mixed and to prevent oxygen deprivation and 
nitrogen-rich medium is continuously supplied while the contents of the tank are evacuated at the 
same rate as the medium is injected. Algae are assumed to consume nitrogen in proportion to their 
population size and nitrogen concentration up to a saturation effect; they reproduce in proportion to 
their consumption. For ecological models a set of stochastic differential equations can be proposed 
with drift term 



where the model has a mechanistic interpretation for an infinite population given in terms of: 
N (ji mol/1) represents the nitrogen concentration in the chemostat 
C (10 9 cells per liter) gives the relative algal density 

6(t) (percentage per day) is the dilution rate; i.e., the rate at which medium is injected and the 
chemostat evacuated 

Vi (*) (P mol/1) is the nitrogen concentration in medium 

p (p, mol/10 9 cells) is the rate of algal consumption of nitrogen 

k (p, mol/1) is a half-saturation constant indicating the value of N at which N/ (re + N) is half-way 
to its asymptote 

X (10 9 cells//i mol) is the algal conversion rate; how fast algae turn consumed nitrogen into new 
algae. 

This system is made stochastic by multiplicative log-normal noise. This is equivalent to a diffusion 
process on N* = log(N) and C* = log(C) with additive noise: 



Here, the diffusion terms provide an approximation to stochastic variation for large but finite popu- 
lations and account for other sources of extra-demographic variation. While alternative parametriza- 
tions of stochastic evolution can be employed, the diffusion approximation is convenient for our 
purposes. A typical goal is to estimate the algal conversion rate \ an d the half-saturation constant k. 
Within this experiment there are a number of experimental parameters that can be chosen: 




(6) 
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(a) A sample path (b) Fixed point vs k 

Figure 1. The chemostat model. In (a), we show a sample phase space trajectory for the uncontrolled 
system; the two smooth curves are the nullclines. Here, 5 = 0.3 and k = 4.4 . Panel (b) shows the 
movement of the fixed point as k varies from 2 to 12, with 5 = 0.3 . 

• The dilution rate of the chemostat 6(t) 

• The concentration of nitrogen input rjj (t) 

• The times at which the samples are taken from the chemostat 

• The quantities that are observed. 

In this paper we focus on the optimization of 5(t), which fits well within the design methodol- 
ogy outlined above. Realistic values for the parameters, estimated from previous experiments, are 
k) = (160,270,0.0027,4.4). We chose o\ = 02 = 0.1 based on visual agreement with 
past experiments; since these act multiplicatively on the dynamics it is reasonable for them to be of 
the same order of magnitude. . We have held rji(t) constant; within the experimental apparatus, rji(t) 
can only be modified by changing between discrete sources of medium. Chemostats are typically in- 
oculated with a small number of algal cells and the models above give algal density relative to a total 
population carrying capacity in the tens of millions. Under these circumstances, initial conditions can 
be given by iV*(0) = log 77/ (the input concentration) and C*(0) oc log 10~ 5 representing an initial 
population of a few hundred cells. 

Our interest here is to optimize 5(t) for the purpose of estimating k . In this model system, one 
can measure both C and N, though nitrogen measurements are both more costly to make and prone 
to distortion by various factors. In addition, continuous measurements are not possible; one can at 
best make measurements a few times a day. Here, we focus mainly on the complete-but-infrequent 
observation regime, where we assume we can make (noisy) measurements of both state variables. We 
also give some results for the partial observation case, in which we try to estimate k from algal counts 
alone. 
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In this system, Fisher Information is given by 

d 



which is maximized when N = k but for C — > oo. Moreover, the experimental apparatus places 
further constraints on the problem; the dilution rate must be positive and cannot exceed the maximal 
rate at which medium can be pumped through the chemostat. 

Examining this system when there is no noise, i.e., o\ = er 2 = , the phase space is fairly simple: 
it has a single stable fixed point given by 

It is easy to check numerically that this fixed point attracts almost all initial conditions in the relevant 
part of phase space. When <j\ , 02 > , all trajectories eventually converge to a neighborhood of the 
fixed point, and the stationary distribution is concentrated around the fixed point for moderate a\, 02. 
See Fig. 1(a) . The fixed point exists only if the dilution rate 5 is not too high; when 5 > 5 C for some 
critical 5 C , the fixed point (Cq , Nfi) moves to —00 . Physically, this means the dilution rate is so high 
that the algal population crashes to 0. For k w 4 (the typical value we assume in this section), the 
critical 5 C is roughly 0.7 . So we will keep the dilution rate below this number. 

As mentioned above, C* is easier to measure experimentally because it can potentially be done 
by an optical particle counter. Unfortunately, in the parameter regime of interest, it is actually fairly 
difficult to infer k from C* measurements alone. To see this, observe that for 5 G [0.1, 0.6] and as k 
varies over [2, 12] , the fixed point moves essentially vertically; see Fig. 1(b). Since trajectories are 
expected to stay near the fixed point, it follows that changing k will have a significant, observable 
impact on the N* dynamics but relatively little on C* . Equivalently, because our estimation strategy 
is based on the likelihoods of state variable observations, A^* measurements (or combined (C*, N*) 
measurements) will be much more useful for estimating k than C* alone. This is our reason for 
focusing on the complete-observation case here. 



Simulation results. In our simulations, we start at (C*,N*) = (—4,2), corresponding to typical 
experimental conditions. With this initial condition, it typically takes a trajectory about 7 days or so 
to reach equilibrium. In our simulations, we carry out experiments of duration T for various values 
of T, ranging from 4 to 30 days. 

For simplicity and for the purposes of calibrating later results, we begin with the results of ideal- 
ized, "full observation" runs. This means we assume we can observe both state variables continuously 
without measurement errors, and that the control value can be updated continuously in time. The op- 
timal control policies, for different values of T, are shown in Fig. 2. The control values are drawn 
from the set {0.1, 0.3, 0.5, 0.68} (remaining below the dilution rate at which all the algae is eventu- 
ally removed from the system), and we assume a flat prior for k over the interval [3.5, 5.5] . As can 
be seen, the computed control policy is nontrivial and depends on T . However, we observed that 
after about T pa 10, the control policy stops changing. The "long-time," steady-state control policy 
uses only the extreme values 0.1 and 0.68; on shorter timescales, the control policy uses all available 
values. 

Fig. 2(d) shows the quantity FI{T)/T, where FI{T) is the mean Fisher information (averaged 
over experimental realizations and initial conditions) as a function of duration T . As can be seen, by 
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Figure 2. Optimal control policies. In (a-c), we show the optimal control policies for different durations 
T. Panel (d) shows FI(T)/T, where FI(T) is the mean "pay-off," i.e., Fisher information averaged over 
initial conditions and experimental realizations. 
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(a) A controlled trajectory (b) The control u t and N* (t) 
Figure 3. Controlled trajectories of the chemostat system (6). 
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T 


Control 


N 


In-range 


Bias 


Std. Dev. 


Std. Dev. Err. 


4 


Dynamic 


256 


138 


0.1551 


0.7837 


0.03464 


4 


0.1 


256 


158 


0.2577 


0.7157 


0.03163 


4 


0.3 


256 


66 


0.1137 


0.9043 


0.03996 


4 


0.68 


256 


42 


0.2243 


0.938 


0.04145 


7 


Dynamic 


256 


256 


0.002351 


0.0395 


0.001746 


7 


0.1 


256 


256 


-0.001245 


0.04593 


0.00203 


7 


0.3 


256 


89 


0.2788 


0.8459 


0.03738 


7 


0.68 


256 


46 


0.06275 


0.9384 


0.04147 


15 


Dynamic 


256 


256 


0.001752 


0.00955 


4.221e-4 


15 


0.1 


256 


256 


0.002518 


0.01284 


5.674e-4 


15 


0.3 


256 


256 


-8.25 le-4 


0.01969 


8.704e-4 


15 


0.68 


256 


48 


0.04579 


0.9314 


0.04116 


30 


Dynamic 


256 


256 


0.001827 


0.004443 


1.964e-4 


30 


0.1 


256 


256 


0.001629 


0.005766 


2.548e-4 


30 


0.3 


256 


256 


0.002547 


0.01075 


4.752e-4 


30 


0.68 


256 


55 


0.1324 


0.9216 


0.04073 



Table 1. Full observation results. We perform TV independent trials for each experimental set-up. in-range 
= number of trials where estimator falls within uniform prior; bias = difference between the mean over tries 
and the true value; std. dev. = standard deviation of the estimator; and std. dev. err. = our estimate of the 
standard error of the standard deviation. 

T = 30, the system has begun to approach (but not yet reached) the asymptotic regime, in which we 
expect FI(T) oc T . To illustrate the effects of the optimal control, a controlled trajectory is shown 
in Fig. 3(a). The corresponding control values are shown in Fig. 3(b), as is the corresponding values 
of N*(t) (not much of interest happens to C*(t)). 

To assess the quality of k estimates, we estimated the estimator variance in simulations. Table 1 
shows the results. As can be seen, the optimal control performs significantly better than most constant 
control values, for a range of T values. However, for every T, the constant control with S m 0.1 
achieves close to the performance of the dynamic control. The reason for this is that in this system, 
the dynamics are dominated by a single stable fixed point. Since the control policy is piecewise 
constant, if the trajectory is asymptotically near the fixed point, then the control will be essentially 
constant. Note that these procedures are also an indirect means to identify optimal control settings 
for the experiment in this situation. 

It follows that dynamic control only makes sense if we are sufficiently far away from the fixed 
point. This can be either during transients, or if the dynamic noise is strong enough to push the 
trajectory away from the fixed point. By "sufficiently far," we mean relative to the size of the region 
over which the control is constant, e.g., one may need to take a finer discretization of the control 
value space (or make it continuous) in order for dynamic control to be very different from the optimal 
constant control. 

Next in Table 1, we turn our attention to complete-but-infrequent observation runs. Here, we 
observe the system twice a day (i.e., at intervals of At = 0.5). Observation noise is (0.025,0.025). 
Naturally, all the standard deviations are larger. The overall trends observed above persist, and for the 
same reasons. 
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T 


Control 


N 


In-range 


Bias 


Std. Dev. 


Std. Dev. Err. 


4 


Dynamic 


256 


104 


0.2398 


0.8295 


0.03666 


4 


0.1 


256 


125 


0.1696 


0.7967 


0.03521 


4 


0.3 


256 


57 


0.1024 


0.9191 


0.04062 


4 


0.68 


256 


45 


0.06094 


0.9385 


0.04148 


7 


Dynamic 


277 


277 


0.01524 


0.1333 


0.005665 


7 


0.1 


256 


256 


-0.002385 


0.1416 


0.006259 


7 


0.3 


256 


76 


0.2187 


0.8772 


0.03877 


7 


0.68 


256 


31 


0.06347 


0.9587 


0.04237 


15 


Dynamic 


256 


256 


-0.001966 


0.06015 


0.002658 


15 


0.1 


256 


256 


-1.926e-4 


0.05113 


0.00226 


15 


0.3 


256 


256 


0.005561 


0.08012 


0.003541 


15 


0.68 


256 


50 


0.1175 


0.9288 


0.04105 


30 


Dynamic 


331 


331 


-6.84e-4 


0.03479 


0.001352 


30 


0.1 


256 


256 


-6.979e-4 


0.03149 


0.001392 


30 


0.3 


256 


256 


0.005991 


0.04461 


0.001972 


30 


0.68 


256 


65 


0.04572 


0.9107 


0.04025 



Table 2. Complete but infrequent observation results. See Table 1 for key. 



T 


Control 


N 


In-range 


Bias 


Std. Dev. 


Std. Dev. Err. 


1 


Dynamic 


256 


184 


3.91 


3.175 


0.1403 


7 


0.1 


256 


215 


3.983 


2.522 


0.1114 


7 


0.3 


256 


179 


3.816 


3.116 


0.1377 


7 


0.7 


256 


142 


3.755 


3.707 


0.1638 


15 


Dynamic 


256 


188 


3.387 


3.288 


0.1453 


15 


0.1 


256 


225 


3.768 


2.428 


0.1073 


15 


0.3 


256 


199 


3.642 


2.999 


0.1325 


15 


0.7 


256 


166 


3.743 


3.494 


0.1544 


30 


Dynamic 


256 


216 


3.746 


2.678 


0.1183 


30 


0.1 


256 


209 


3.877 


2.632 


0.1163 


30 


0.3 


256 


196 


3.881 


3.006 


0.1328 


30 


0.7 


256 


198 


3.754 


3.045 


0.1346 



Table 3. Partial observation results. 
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Finally, to confirm our intuition, we show the results of partial observation runs in Table 3, where 
we observe only C* . As expected, we cannot estimate k very well at all: many of the estimates fall 
outside the prior range. Nevertheless, if we measure the estimator performance by the number of 
estimates that fall inside the prior range, we see that the controlled system is comparable to the best 
constant control. Note that to accommodate the larger variance, we enlarge the prior range to [2, 12] 
in this set of simulations. 



5 Example: Morris-Lecar (ML) Neuron Model 

In the previous section, we have seen that our optimal control procedure correctly identifies "information- 
rich" state space regions for the chemostat problem, and yields a control policy that drives state tra- 
jectories to that region. Since the chemostat model state space is dominated by just a single fixed 
point, a natural question is how our method fares in more complex dynamical situations. In this sec- 
tion, we examine a model with very different dynamics, namely the Morris-Lecar (ML) neuron model 
(Morris and Lecar, 1981). 

The ML model is often used in computational neuroscience for modeling membrane voltage 
oscillations and other properties of neuronal dynamics; it describes how membrane voltage and ion 
channels interact to generate electrical impulses, or "spikes," which is the primary means for neuronal 
information transmission. The ML model is planar, and hence more amenable to analysis than higher- 
dimensional models like the Hodgkin-Huxley equations (Hodgkin and Huxley, 1952). At the same 
time, it faithfully captures certain important aspects of neuronal dynamics (Ermentrout and Rinzel, 
1998), making it a commonly-used model in computational neuroscience. It has a rich bifurcation 
structure, and exhibits two dramatically different timescales. A good reference for this model is 
Terman and Ermentrout (2010); another is Ermentrout and Rinzel (1998). 

The ML model has the form 

C m v t =h ~ 9i (v t - Ei) - g K w t (v t - E K ) - g Ca modvt) (v t - E Ca ) + PvW^ 

w t =<f> — h p w j(v, w)W t (8) 

T w {v t ) 

where the auxiliary functions Woo, t w , and moo are given in Appendix A. In the equations above, the 
I t term represents an externally applied current which we treat as the control variable. The variable 
vt is the voltage across the cell membrane; the first equation above is just Kirchoff 's current law. The 
(dimensionless) variable wt is a "gating variable": it is a number in [0, 1] that describes the fraction of 
membrane ion channels that are open at any time; the second equation can be interpreted as the master 
equation for a two-state Markov chain describing the opening and closing of ion channels. The terms 
with (3 V and f3 w ~f(v, w) are independent noise terms modeling different sources of noise: the ft v term 
models voltage fluctuations, and is simply additive white noise. The f3 w j(v, w) term models random 
fluctuations in the number of open ion channels due to finite-size effects; here we use 

7 («, w) = l-!L-( Woo ( v ).(i-2w)+w) , (9) 

V T W {V) V / 

which arises from an underlying Markov chain model 3 . 

3 This can be derived via a limit theorem ofTom Kurtz (Wang, 2011;Kurtz, 1971, 1981). See also Smith (2002). 



16 



-SO -60 -40 -20 20 40 60 -60 -40 -20 20 40 

V V 



(a) Noiseless ML phase portrait (b) Controlled and uncontrolled trajectories 

Figure 4. ML phase portrait of a stimulated neuron. Time is measured in ms, voltage in mV. In (a), the phase 
portrait of the noiseless ML system is shown. A stable limit cycle surrounds an unstable cycle, which in 
turn surrounds a sink. In (b), two trajectories for the noisy system are shown: the dotted curve is a trajectory 
without control, while the solid curve is a controlled trajectory. 

Here, we assume a typical experimental set-up in which an electrode (a "dynamic clamp") is at- 
tached to the neuron, through which an experimenter can inject a current I t and measure the resulting 
voltage vt, the gating variable wt is not directly observable. The electrode is usually attached directly 
to a computer, which records the measured voltage trace and generates the time-dependent injected 
current, making this type of experiment a natural candidate for our method. We note that unlike 
the case of the chemostat model, it is possible to measure neuronal membrane voltages to very high 
precision (with signal-to-noise ratios of 1000 to 1 or more). On the other hand, dynamical events of 
interest can take place on timescales of milliseconds or less. So for this application it is vital that 
one pre-computes as much as possible - there is not enough time for extensive on-line computation, 
though there may be sufficient time for an efficiently-programmed state estimator. 4 

We focus on a parameter regime in which the noiseless system can switch between having a 
globally-attracting fixed point and an unstable fixed point surrounded by a stable limit cycle (Fig. 4(a) 
illustrates the latter). The limit cycle represents a "tonically" (periodically) spiking neuron; the effect 
of noise is to "smear out" this limit cycle. The precise parameters we use come from Terman and Ermentrout 
(2010); they are summarized in Appendix A. The key parameter here is the injected DC current; in 
Fig. 4(a) this is / = 100.0 . As one decreases / from this value, the unstable fixed point becomes 
stable via a subcritical Hopf bifurcation, and an unstable cycle emerges. Post-bifurcation, the system 
is bistable: while tonic spiking continues to be viable, a quiescent state (corresponding to the stable 
fixed point) has emerged. If we decrease / even further, the stable and unstable cycles collide in a 
saddle-node bifurcation, leaving behind a single stable fixed point. 

In this example, our control is ut = I(t)/C m . The control values we use are u € {0, 3.5, 5} , 
corresponding to I G {0, 70, 100} (we take C m = 20). The three values of the injected current place 
the system in the stable fixed point, bistable, and limit cycle regimes, respectively. 

4 In real experiments, one may need to use a computationally cheaper state estimation technique, e.g., an ensemble 
Kalman filter, in place of the particle filter. Since we are interested in the potential utility of dynamic control, we continue 
to use particle filters here. 
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(a) (b) 
Figure 5. Control policy for gc a - In (a), the FI is shown. In (b), the control policy. 

Simulation results. We have performed simulations in which we tried to estimate the calcium con- 
ductance constant gc a this is chosen over other parameters because it gives rise to a nontrivial 
control policy (as we show below). In detail, we assume a flat prior for gc a on the interval [4, 5]; in 
all experiments reported below, we fix a randomly-generated gc a value of 4.41498308, which we take 
to be the "true" value. The optimal control is computed using a timestep of « 2 ms, and we assume 
that measurements are available at every time step. State space is discretized by cutting the region 
[—80, 80] x [0, 1] into 72 x 72 bins. Unless otherwise stated, "constant control" means u t = 3.5 . 

Fig. 4(b) shows examples of controlled and uncontrolled (with u = 5, or I = 100) trajectories: 
the perturbations are measurable, but not large. The resulting estimator statistics for estimates of gc a 
using simulated trajectories of duration T = 1000 ms are: 



Observation 


Control 


Mean 


Bias 


Std. Dev. 


Full 


Dynamic 


4.416 


6.315e-4 


0.01106 


Full 


3.5 


4.409 


-0.005686 


0.05045 


Noisy partial 


Dynamic 


4.413 


-0.002062 


0.01579 


Noisy partial 


3.5 


4.416 


5.43e-4 


0.05685 



"Full observation" runs use information from the entire state-space trajectory. As can be seen, the 
dynamic control improves the estimator variance by a significant amount. Just as in the chemostat 
example, there is a measurable bias in the estimates, which one can correct by simulation. 

For comparison, the table below shows the results using different values of constant control: 



Observation 


Constant control 


Mean 


Bias 


Std. Dev. 


Noisy partial 


0.0 


4.413 


-0.001552 


0.06893 


Noisy partial 


3.5 


4.416 


5.43e-4 


0.05685 


Noisy partial 


5.0 


4.416 


0.001458 


0.01847 



Clearly, dynamic control outperforms constant control in reducing estimator variance . To better 
understand why, we examine the structure of the control policy and Fisher information. Fig. 5(a) 
shows the Fisher information function for gc a > which is easily shown to be proportional to 

[moo(v) (v - E C a)] 2 ; 
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Figure 6. Trajectories of the Morris-Lecar model under constant control (dashed) and optimal control 
(solid). 

the controlled trajectory is superposed. As can be seen, the controlled trajectory starts at the resting 
voltage (around -80 mV), jumps toward mV, runs along the "information-rich" strip around v = 20 
before dipping back toward the resting voltage. In comparison, the uncontrolled trajectory in Fig. 4(b) 
appears to spend less time in this region. Fig. 5(b) shows how the optimal control policy accomplishes 
this by suitably pushing the trajectory at critical parts of its cycle. 

To see more clearly the effect of the dynamic control, we plot time traces of the controlled and 
uncontrolled trajectories in Fig. 6. As can be seen, the dynamic control not only tries to push the 
trajectory into the information-rich region, it also makes sure it spends more time there. 

Finally, we note that the optimal control depends a great deal on the parameter being estimated. 
If one were to try to estimate C m , for example, the resulting control policy is essentially constant for 
most of phase space. 

6 Discussion 

There has been increasing interest in combining experimental data and statistical methods with mech- 
anistic dynamical models describing system behavior. This paper adds to this literature in considering 
the problem of designing inputs into such experiments that are directed at improving the accuracy of 
parameter estimates that result from them. In particular, we have demonstrated that in diffusion pro- 
cesses, maximizing Fisher Information about a parameter can be cast as a problem of optimal control 
and shown that using this strategy can substantially improve parameter estimates. In order to make 
control methods feasible when systems are observed with noise or only some state variables are ob- 
servable, we employ the strategy of estimating the value of the state on-line and using this within a 
pre-computed control policy. From our experiments, the use of these methods shows highest impact 
in systems that exhibit cyclic behavior rather than fixed points and when the filtered estimate of the 
state is most accurate. 

There remain numerous unresolved problems and open areas in which these methods can be 
extended. Computational cost, both in speed and memory, represent the largest limiting factor in 
employing these methods. In particular, the storage and computation costs of the policy scale expo- 
nentially with the number of state variables. Possible strategies here involve the application of sparse 
grid strategies for discretizing the state variables (Xiu, 2009) or - more heuristically - intensive Monte 
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Carlo methods to simulate the state variables forward combined with machine learning methods to 
estimate control strategies. It may also be possible to pre-compute control policies only for high- 
probability regions of the state space and employ techniques of approximate dynamic programming 
(Powell, 2007). 

Further extensions involve alternative targets for the control policy. Thorbergsson and Hooker 
(2012) explores maximizing the Fisher Information for a partially observed Markov decision process; 
although this exacerbates the computational difficulties discussed above. Multiple parameters of in- 
terest can be readily accommodated by maximizing the trace or determinant of the Fisher Information 
matrix. We have dealt with parameter uncertainty before the experiment by averaging the Fisher In- 
formation over the prior within our objective. Alternative strategies such as maximizing the minimum 
Fisher Information in a range of parameters can be implemented within the numerical control strat- 
egy. We have also not experimented with updating our prior as the experiment progresses; where the 
optimal control policy depends strongly on the system parameters this may be expected to produce 
significant improvements. 
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stages of this work, and Xueying Wang for pointing out the ML model with membrane channel noise. 
This work was conducted as part of the Statistical and Applied Mathematical Sciences Institute's 
Program on Stochastic Processes, National Science Foundation under grant DMS-0907927 (KL), 
DEB-0813743, CMG-0934735 and DMS-1053252 (GH). 

A ML Model Parameters 

Here we briefly summarize the details of the ML model used in this paper. The interested reader is 
referred to (Ermentrout and Rinzel, 1998; Terman and Ermentrout, 2010) for more details. 
Recall the ML equations 

C m V = I(t) -gieak ■ (V - Uleak) - 9K W ■ (v - V K ) - g Ca m^u) • (V - «Ca) 
W = (p ■ (woo(v) - w)/t w {v) . 

As explained in Sect. 5, the ML model tracks the membrane voltage v and a gating variable w. The 
constant C m is the membrane capacitance, <fi a timescale parameter, and I(t) an injected current. 

Spike generation depends crucially on the presence of voltage-sensitive ion channels that are 
permeable only to specific types of ions. The ML equations include just two ionic currents, here 
denoted calcium and potassium. The voltage response of ion channels is governed by the «;-equation 
and the auxiliary functions Woo , t w , and TOqo , which have the form 

2 [l + tanh ((f - vi)/v2)] , 
1/C08h((v-V3)/(2V A )) , 
2 [l + tanh ((f - vs)/v^)] . 

The function moo(v) models the action of the relatively fast calcium ion channels; vca, is the "rever- 
sal" (bias) potential for the calcium current and gc a the corresponding conductance. The gating vari- 
able w and the functions t w (v) and u>oo(v) model the dynamics of slower-acting potassium channels, 



T w (v) = 
Woo(v) = 



20 



Parameter 


Value 


Parameter 


Value 


h 


95 


VK 


-84.0 


Cm 


20.0 


^leak 


-60.0 


9Ca 


4.41498308M 


WCa 


120.0 


8.0 


V\ 


-1.2 


9K 


V 2 


18.0 


9lcak 


2.0 


V3 


2.0 


<f> 


0.04 


V4 


30.0 



Table 4. ML parameter values used in this paper. (*) The "target" value of gc a is randomly generated from 
a certain range. 

with its own reversal potential vk and conductance gx ■ The constants ui ea k and gieak characterize the 
"leakage" current that is present even when the neuron is in a "quiescent" state. The forms of , 
t w , and Woo (as well as the values of the v£) can be obtained by fitting data, or reduction from more 
biophysically-faithful models of Hodgkin-Huxley type (see, e.g., (Terman and Ermentrout, 2010)). 

The precise parameter values used in this paper are summarized in Table 4. These are obtained 
from (Terman and Ermentrout, 2010). Throughout, the noise intensities are /3 V = 1, f3 w = 0.1. 
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